Prior Gradient Mask Guided Pruning-Aware Fine-Tuning

Abstract

We propose a Prior Gradient Mask Guided Pruning-aware Fine-Tuning (PGMPF) framework to accelerate deep Convolutional Neural Networks (CNNs). In detail, PGMPF selectively suppresses the gradient of those "unimportant" parameters via a prior gradient mask generated by the pruning criterion during fine-tuning. PGMPF has three appealing characteristics over previous works: (1) Pruning-aware network fine-tuning. A typical pruning pipeline consists of training, pruning, and fine-tuning, which are relatively independent, while PGMPF uses a variant of the pruning mask as a prior gradient mask to guide fine-tuning, without complicated pruning criteria. (2) An excellent tradeoff between large model capacity during fine-tuning and stable convergence speed to obtain the final compact model. Previous works preserve more training information of the pruned parameters during fine-tuning to pursue better performance, which would incur catastrophic non-convergence of the pruned model for relatively large pruning rates, while PGMPF greatly stabilizes the fine-tuning phase by gradually constraining the learning rate of those "unimportant" parameters. (3) Channel-wise random dropout of the prior gradient mask to impose some gradient noise on fine-tuning and further improve the robustness of the final compact model. Experimental results on the image classification benchmarks CIFAR10/100 and ILSVRC-2012 demonstrate the effectiveness of the method for various CNN architectures, datasets, and pruning rates. Notably, on ILSVRC-2012, PGMPF reduces 53.5% of the FLOPs of ResNet-50 with only a 0.90% top-1 accuracy drop and a 0.52% top-5 accuracy drop, advancing the state of the art with negligible extra computational cost.
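The gradient-masking idea in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: channels are ranked by L1 norm (the abstract does not name the pruning criterion), the suppression factor decays linearly (the abstract only says the learning rate of "unimportant" parameters is constrained gradually), and channel-wise dropout is read as zeroing a randomly chosen channel's gradient scale for one step. All function names are hypothetical.

```python
import random


def channel_mask(weights, prune_rate):
    """Return 1 for channels to keep and 0 for channels marked "unimportant".

    Assumption: channels are ranked by L1 norm; the abstract only says
    "pruning criterion" and does not name one.
    """
    norms = [sum(abs(w) for w in channel) for channel in weights]
    n_prune = int(len(weights) * prune_rate)
    by_norm = sorted(range(len(weights)), key=lambda i: norms[i])
    pruned = set(by_norm[:n_prune])
    return [0 if i in pruned else 1 for i in range(len(weights))]


def suppression_factor(step, total_steps):
    """Hypothetical linear schedule: 1.0 at step 0, 0.0 at the last step."""
    return max(0.0, 1.0 - step / total_steps)


def prior_gradient_mask(mask, alpha, drop_prob=0.0, rng=random):
    """Turn a 0/1 pruning mask into per-channel gradient scales.

    Kept channels get scale 1.0 and "unimportant" channels get alpha.
    Assumption: channel-wise dropout zeroes a channel's scale with
    probability drop_prob for the current step.
    """
    scales = [1.0 if keep else alpha for keep in mask]
    return [0.0 if rng.random() < drop_prob else s for s in scales]


def masked_sgd_step(weights, grads, scales, lr):
    """Plain SGD update with each channel's gradient multiplied by its scale."""
    return [
        [w - lr * s * g for w, g in zip(w_ch, g_ch)]
        for w_ch, g_ch, s in zip(weights, grads, scales)
    ]
```

In a fine-tuning loop under these assumptions, `suppression_factor` would be evaluated once per step and passed as `alpha`, so the updates to "unimportant" channels shrink toward zero while the remaining channels train at the full learning rate.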

Journal

Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2022

ISSN: 2159-5399, 2374-3468

DOI: https://doi.org/10.1609/aaai.v36i1.19888